Task: Define Central Starting Points (AST)

Method of operation

A good starting point is essential for being able to test and retest. It contains everything needed to prepare the test object and the test environment before the test cases in the test script are started. This involves not only the test data required for the processing, but also the condition in which the system and its environment should be. Examples are the setting of a certain system date, or the running of certain weekly and monthly batches that put the system into a particular condition.

In practice, incorrect starting points prove to be a significant source of testing problems. To avoid using the wrong starting points during test execution, consider at an early stage how they are to be constructed and which process is to govern their use. If this is not done, the following problems may arise:

Non-reproducible test results - If a test script is executed twice on the same version of the test object and the results vary, this may be the result of divergent test data in the starting point. Extra data may have been added to or removed from the starting point for other tests.
Deteriorating starting point - During the test execution, test data are used and amended. New data come into the system; existing data are amended or perhaps even removed. If no process exists to manage the starting point, nothing is known regarding its quality.
Testing gets increasingly expensive - If the starting point is of poor quality and is not documented anywhere, the testers are obliged to make increasing efforts (in seeking or creating test data) for the execution of the test cases. Moreover, the risk of mistakes on the part of the tester increases. This will increase further in time, as the starting point becomes increasingly less well known and therefore poorer.
Insufficient information on defects causes delay - The starting point takes an important place in the reporting of a defect. It clarifies a defect. If this starting point is not known during the analysis of the defect, delay will result. Developers themselves have to go in search of the original starting point or have to ask the tester for clarification.
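One way to notice a deteriorating or divergent starting point is to record a fingerprint of its contents and compare it before each test run. The following is a minimal sketch under assumptions: the starting point is represented as a plain dictionary, and the table names, records and the `fingerprint` helper are hypothetical illustrations, not part of any prescribed tooling.

```python
import hashlib
import json


def fingerprint(starting_point: dict) -> str:
    """Return a stable SHA-256 digest of the starting point's contents."""
    # sort_keys makes the serialisation independent of dict insertion order.
    # The order of rows inside a list still matters in this sketch.
    canonical = json.dumps(starting_point, sort_keys=True, ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical starting point: table name -> rows, plus the system date.
baseline = {
    "customers": [{"id": 1, "name": "A. Jansen"}, {"id": 2, "name": "B. de Vries"}],
    "system_date": "2024-12-31",
}
recorded = fingerprint(baseline)

# Another test adds a customer without anyone recording the change.
current = {
    "customers": baseline["customers"] + [{"id": 3, "name": "C. Bakker"}],
    "system_date": "2024-12-31",
}

unchanged = fingerprint(baseline) == recorded
drifted = fingerprint(current) != recorded
```

Recording the fingerprint alongside a defect report also gives developers a quick way to confirm that they are analysing the defect against the same starting point.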

In the test specifications, the necessary starting point is specified per test script. To avoid redundancy and to restrict the number of physical files needed, one or more central starting points are defined that the testers can use in the creation of their test cases.

The creation of central starting points can take place in parallel with the setting up of the test specifications and is often an iterative process. Often, a tester will start with a central starting point by, for example, proposing the contents of master files. Master files are data that drive the system, but are not part of the primary data processing. Examples are discount tables, tax percentages, postcode tables, product types and customer types. A subsequent step may be to propose an initial content of primary data, e.g. a number of customers, products, orders and invoices. It may be decided to define several central starting points, if this appears to be useful in specifying the tests. The difference may be the type of data, e.g. one central starting point with all kinds of variations in customers, and another with all kinds of variations in orders. Another possibility is a difference in time. For example, one central starting point could be defined just before the year’s end and another just before disbursement of holiday pay, since these are significant testing points.
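The structure described above (master data, primary data and a point in time) could be captured in a small description per central starting point. This is only a sketch: the `StartingPoint` class, its field names, the dates and the data values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class StartingPoint:
    """Description of one central starting point (hypothetical layout)."""
    name: str
    system_date: date                                  # date the system is set to
    master_data: dict = field(default_factory=dict)    # data that drive the system
    primary_data: dict = field(default_factory=dict)   # customers, orders, ...


# Master data shared by both starting points (illustrative values).
SHARED_MASTER_DATA = {
    "tax_percentages": {"standard": 21, "reduced": 9},
    "customer_types": ["retail", "wholesale"],
}

# Two central starting points that differ in time.
year_end = StartingPoint(
    name="CSP-year-end",
    system_date=date(2024, 12, 30),
    master_data=SHARED_MASTER_DATA,
    primary_data={"customers": ["C001", "C002"], "orders": ["O001"]},
)
holiday_pay = StartingPoint(
    name="CSP-holiday-pay",
    system_date=date(2025, 5, 20),
    master_data=SHARED_MASTER_DATA,
    primary_data={"customers": ["C001", "C002"], "orders": ["O001"]},
)
```

Keeping the description in a structured form like this makes it easier to place under configuration management together with the other testware.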

In addition, all kinds of starting points emerge in the creation of the test specifications, usually one per test script. The tester who manages the central starting point will consult with the tester who owns the script's starting point about which data are suitable for adding to the central starting point. In this, the following criteria, for example, could be used:

Can other testers reuse (part of) the starting point of the test script?
Does the starting point of the script conflict with (the consistency of) the central starting point?
Can including the starting point of the script in the central starting point disrupt other tests?
Will including the starting point of the script in the central starting point lead to efficiency benefits in the execution of the script?
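The consistency criterion in the list above lends itself to an automated check. The sketch below covers only that one criterion and assumes both starting points are dictionaries of the form table -> key -> row; the function names and sample data are hypothetical.

```python
def find_conflicts(central: dict, script: dict) -> list:
    """List (table, key) pairs that both starting points define differently."""
    conflicts = []
    for table, rows in script.items():
        central_rows = central.get(table, {})
        for key, row in rows.items():
            if key in central_rows and central_rows[key] != row:
                conflicts.append((table, key))
    return sorted(conflicts)


def merge_if_consistent(central: dict, script: dict) -> dict:
    """Return a new central starting point, or raise if the data conflict."""
    conflicts = find_conflicts(central, script)
    if conflicts:
        raise ValueError(f"conflicting records: {conflicts}")
    # Copy each table so the original central starting point is untouched.
    merged = {table: dict(rows) for table, rows in central.items()}
    for table, rows in script.items():
        merged.setdefault(table, {}).update(rows)
    return merged


central = {
    "customers": {"C001": {"type": "retail"}},
    "tax_rates": {"standard": {"percent": 21}},
}
script_ok = {"customers": {"C002": {"type": "wholesale"}}}
script_bad = {"tax_rates": {"standard": {"percent": 19}}}

merged = merge_if_consistent(central, script_ok)
bad_conflicts = find_conflicts(central, script_bad)
```

The other criteria (reuse by other testers, disruption of other tests, efficiency benefits) remain a matter for consultation between the testers.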

There are various possibilities for loading the central starting point with test data.

The description of the central starting points is created in accordance with the established norms and standards for testware and taken under configuration management after completion.

Naming test data

A point of focus when creating your own physical test data is the business of naming. It may be decided to name the data similar to those in production. In that case, realistic (although fictional) names are given to e.g. test customers, test addresses, test codes, test products, etc.

It may also be decided to give the data a name that is relevant to the test, for example by including the test-case number, test unit, object part or test goal in the name. This will also help with the solving of defects and transfer to other testers.

The third option is to generate meaningless names. For the foregoing example of test customers, then, these would be:

Person1
Person2
Person3
Person4
Etc.

This last option saves time in searching for and creating realistic or test-related names, but also involves a risk. It may cause a certain functionality or other characteristic of the system to respond differently than it would with realistic data. Examples are the operation of the sorting algorithm (the sorting task is now fairly simple and therefore cannot be extensively tested), long names of individuals or letters with accents. Another example is performance. The database management system might treat a table with 1,000 generated names that are numbered consecutively differently from a table with 1,000 realistic names. The index on the table may be constituted differently, which may be detrimental to performance.
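The sorting risk can be made concrete with a few lines of Python. This is an illustration only: the realistic names are invented, and the sort shown is Python's default code-point ordering, which is an assumption and may differ from the collation used by the system under test.

```python
# Generated, meaningless names as in the list above.
generated = [f"Person{i}" for i in range(1, 13)]
sorted_generated = sorted(generated)
# Plain string sorting places Person10 directly after Person1.

# Invented realistic names with accents and a long surname.
realistic = [
    "Zoë van den Berg",
    "Émile Dubois",
    "Anna de Vries",
    "Maximiliaan van Oranje-Nassau-Lippe",
]
sorted_realistic = sorted(realistic)
# With code-point ordering the accented "É" sorts after "Z".

longest_generated = max(len(name) for name in generated)
longest_realistic = max(len(name) for name in realistic)
```

The generated set never exercises accented characters or long field values, so defects in collation or field length would not surface with it.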

Entering test data

There is a choice of three possibilities for the entering of test data:

1. Entering through regular system functions
2. Entering through separate front-end software
3. Use of production data

1. Entering through regular system functions

Entering test data through regular system functions has the disadvantage that those functions themselves have often not been exhaustively tested and that the data entered therefore need to be thoroughly checked. The advantage is that during the accumulation of the files, the regular functions are implicitly tested simultaneously and the consistency between the data is guaranteed. A condition, however, is that the input functions need to be delivered first. This should be agreed in advance with the supplier of the software.

2. Entering through separate front-end software

Entering test data through separate front-end software and test files has the risk that the test environment will contain inconsistent or non-permitted situations, since there was no check on the input. This means that technical support is required with the accumulation and, of course, tested front-end software must be available. The advantage is that the files can be accumulated relatively quickly.

3. Use of production data

The use of production data as test data has the advantage that testing can be done with a lot of data, that the files can be built up quickly and that any conversion software is tested implicitly. A disadvantage is that these data show little variation and it can mean a lot of searching for the right variation in starting point data in a test case. Another disadvantage is that it is not always permitted to work with production data (because of privacy legislation or openness to fraud). This makes it necessary to make identifying data unrecognisable. In some cases, a production copy is not frozen for the test, but a new copy is periodically placed in the test environment. The disadvantage of this is that the tests are not directly repeatable, because the production data of each copy are different each time, so that the test result predictions are no longer correct.
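Making identifying data unrecognisable can be done in several ways. The sketch below uses keyed hashing (HMAC-SHA-256) as one possible technique; it is an assumption, not a method named in this text. The field names, the key and the `anon-` prefix are hypothetical. Because the same input always yields the same pseudonym, relations between records are preserved. Whether a given technique satisfies the applicable privacy legislation must be assessed separately.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would be kept outside the test environment.
SECRET_KEY = b"replace-with-a-secret-kept-outside-the-test-environment"
# Hypothetical list of fields that identify a person.
IDENTIFYING_FIELDS = ("name", "email")


def pseudonymise(value: str) -> str:
    """Replace a value with a keyed, repeatable pseudonym."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "anon-" + digest[:12]


def mask_record(record: dict) -> dict:
    """Return a copy of the record with identifying fields replaced."""
    return {
        field: pseudonymise(value) if field in IDENTIFYING_FIELDS else value
        for field, value in record.items()
    }


production_row = {
    "customer_id": 1017,
    "name": "J. Jansen",
    "email": "j.jansen@example.com",
    "order_total": 249.95,
}
masked = mask_record(production_row)
```

Non-identifying fields such as `order_total` are left untouched, so the variation present in the production data remains available to the tests.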

Aside from planning and budget difficulties, the first alternative, entering test data through regular system functions, is preferable. If the test team has permission to obtain test files from production, it is also possible to combine the three alternatives. Choose a collection of production data that, for example, contain a particular type of information (customer, order, invoice, etc.). This subset is loaded into the test environment (retaining consistency among the various data). Subsequently, with the aid of regular system functions, changes are made to these data to create the desired starting point.

Use of starting points during the test

The use of the central starting point during the test should be considered in advance. This chiefly concerns the choice between:

1. The cumulative construction of the central starting point (unstructured or structured)
2. Periodic restore with the central starting point (master copy)
3. The parallel use of several versions.

1. The cumulative construction of the central starting point (unstructured or structured)

With cumulative construction, the central starting point grows along with the tests. If this is done in an unstructured way, the testers input new test data as required. This gives the testers much freedom and flexibility, but also has a disadvantage. A variety of testers input their own test data, which can influence the test results of other tests. This can cause a lot of wasted searching time in the analysis of test results. Besides, data will quickly become inconsistent. With the structured variant, the testers make agreements in order to prevent such influences. For example, they may agree that only certain types of test data may be entered or changed, or that test data should be identified so that it can be seen to which tester they belong.
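An agreement that test data must show which tester they belong to could be supported by a simple naming convention. The sketch below assumes a shared dictionary of records and a `tester:id` key format; both, along with the helper names, are hypothetical.

```python
def tagged_key(tester: str, record_id: str) -> str:
    """Prefix a record id with the name of the tester who owns it."""
    return f"{tester}:{record_id}"


def owner_of(key: str) -> str:
    """Read the owning tester back from a tagged key."""
    return key.split(":", 1)[0]


def amend(data: dict, tester: str, key: str, row: dict) -> None:
    """Change a record only if the calling tester owns it."""
    if key not in data:
        raise KeyError(key)
    if owner_of(key) != tester:
        raise PermissionError(f"{key} belongs to {owner_of(key)}, not {tester}")
    data[key] = row


shared = {}
shared[tagged_key("anna", "C001")] = {"type": "retail"}
shared[tagged_key("bram", "C001")] = {"type": "wholesale"}

# Anna may change her own record.
amend(shared, "anna", "anna:C001", {"type": "corporate"})

# Bram may not change Anna's record.
try:
    amend(shared, "bram", "anna:C001", {"type": "retail"})
    blocked = False
except PermissionError:
    blocked = True
```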

2. The periodic restore with the central starting point (master copy)

A second approach is the regular restoring of the central starting point (also called the ‘master copy’). This is done via a backup-and-restore procedure. A backup is first made of the master. At certain times, the administrator of the master restores it. That may be periodically, for example every day of the week, but also on request, for example after the execution of a test. A special management procedure can provide for the structured addition of test data to the master. A big advantage is the manageability of the data, but disadvantages are the dependency on the restore point and the extra work to go from the master to the starting point necessary for the test.
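The backup-and-restore cycle can be illustrated with Python's built-in `sqlite3` module, whose `Connection.backup` method copies one database over another. This is a sketch under assumptions: a real test environment will use its own database and restore tooling, and the table and sample rows are hypothetical.

```python
import sqlite3


def make_master() -> sqlite3.Connection:
    """Create a small in-memory master copy (hypothetical contents)."""
    master = sqlite3.connect(":memory:")
    master.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    master.executemany(
        "INSERT INTO customers (name) VALUES (?)",
        [("A. Jansen",), ("B. de Vries",)],
    )
    master.commit()
    return master


def restore(master: sqlite3.Connection, test_db: sqlite3.Connection) -> None:
    """Overwrite the test database with the contents of the master copy."""
    master.backup(test_db)


def customer_count(db: sqlite3.Connection) -> int:
    return db.execute("SELECT COUNT(*) FROM customers").fetchone()[0]


master = make_master()
test_db = sqlite3.connect(":memory:")

restore(master, test_db)
count_after_first_restore = customer_count(test_db)

# A test run amends the data.
test_db.execute("DELETE FROM customers WHERE id = 1")
test_db.commit()
count_after_test = customer_count(test_db)

# Restoring brings the test database back to the master copy.
restore(master, test_db)
count_after_second_restore = customer_count(test_db)
```

Any extra data a specific test needs on top of the master still has to be loaded after each restore, which is the extra work mentioned above.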

3. The parallel use of several versions

A third possibility is the use of several environments with parallel versions of the data. Each tester has his own test environment and starting point(s). Having a central starting point at your disposal may remain useful, but each tester is able to amend it as he wishes in his own environment. A big advantage of this approach is the independence of the tests: disruption by other tests is barely possible, since the tester knows exactly what is in his own starting point. That delivers great savings in time. A disadvantage is that, because of the isolation of the tests, faults in starting points can remain undetected for long periods and integral test aspects are only dealt with at a late stage. Another disadvantage is the extra cost for the required test environments, both in terms of hardware and of administration.

An important condition for this method of operation is good configuration management. This should ensure that the software deliveries and follow-up deliveries in connection with solved defects are rolled out to every test environment simultaneously. This could be a risk factor.
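A simple check can flag test environments that have missed a delivery. The sketch assumes that each environment reports the software version it is running; the environment names, version strings and the `out_of_sync` helper are hypothetical.

```python
def out_of_sync(environments: dict, expected_version: str) -> list:
    """Return the names of environments not running the expected version."""
    return sorted(
        name
        for name, version in environments.items()
        if version != expected_version
    )


# Hypothetical environments, one per tester, with their installed version.
environments = {
    "env-anna": "2.4.1",
    "env-bram": "2.4.1",
    "env-carla": "2.4.0",
}
lagging = out_of_sync(environments, "2.4.1")
```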

Test data with outsourcing

A development that is attracting the attention of various legislators is the handling of electronic data during outsourcing. Two subjects warrant special attention here:

Confidentiality of the data used - Increasingly, laws or formal guidelines establish how electronic personal details should be dealt with and how to guarantee that such information remains confidential. When test data are created from production databases, it is necessary in cases of outsourcing that the data are made anonymous, since they leave the organisation and sometimes the country. Cases are known of employees of the supplier abusing software and data belonging to the outsourcing organisation.
Responsibility for supply of data - Another point of focus is the specifying of the test cases and the necessary master data. The supplier sometimes has insufficient subject knowledge to create realistic values himself. Extreme examples are: using postcode tables with a wrong number of numeric positions or setting the VAT percentage at 100%. This can seriously disrupt the execution of tests and also makes checking of the test results extremely difficult. If certain master data are important to a good test, agreements should be made concerning who will deliver them, and when.
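Unrealistic values of the kind mentioned above can be caught with a plausibility check on delivered data. The sketch below is illustrative only: the postcode pattern assumes a Dutch-style format (four digits followed by two capital letters), and the upper bound for a plausible VAT percentage is an arbitrary assumption. Both would need to be replaced by the rules that apply to the system under test.

```python
import re

# Assumed postcode format: four digits, optional space, two capital letters.
POSTCODE_PATTERN = re.compile(r"[1-9][0-9]{3} ?[A-Z]{2}")
# Assumed upper bound for a plausible VAT percentage.
MAX_PLAUSIBLE_VAT = 30.0


def check_master_data(postcodes: list, vat_percentages: dict) -> list:
    """Return a list of human-readable problems found in the delivered data."""
    problems = []
    for code in postcodes:
        if not POSTCODE_PATTERN.fullmatch(code):
            problems.append(f"postcode {code!r} does not match the expected format")
    for name, percent in vat_percentages.items():
        if not 0 <= percent <= MAX_PLAUSIBLE_VAT:
            problems.append(f"VAT rate {name!r} of {percent}% is not plausible")
    return problems


# Delivery containing the two extreme examples from the text.
delivered_postcodes = ["1234 AB", "123 AB"]
delivered_vat = {"standard": 100, "reduced": 9}
problems = check_master_data(delivered_postcodes, delivered_vat)
```

Running such a check when the data are delivered, rather than during test execution, keeps unrealistic values from disrupting the tests.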

Products

Test basis defects
A description of the central starting point(s).

Tools

Testware management tool.